31 research outputs found

Optimizing MapReduce for Multicore Architectures

Authors: Kaashoek Frans; Mao Yandong; Morris Robert
Publication date: 01/01/2010

MapReduce is a programming model for data-parallel programs originally intended for data centers. MapReduce simplifies parallel programming, hiding synchronization and task management. These properties make it a promising programming model for future processors with many cores, and existing MapReduce libraries such as Phoenix have demonstrated that applications written with MapReduce perform competitively with those written with Pthreads. This paper explores the design of the MapReduce data structures for grouping intermediate key/value pairs, which is often a performance bottleneck on multicore processors. The paper finds the best choice depends on workload characteristics, such as the number of keys used by the application, the degree of repetition of keys, etc. This paper also introduces a new MapReduce library, Metis, with a compromise data structure designed to perform well for most workloads. Experiments with the Phoenix benchmarks on a 16-core AMD-based server show that Metis's data structure performs better than simpler alternatives, including Phoenix.
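As a rough illustration of the grouping step the abstract refers to, the sketch below collects intermediate key/value pairs in a plain hash table before reducing them. It is a minimal single-threaded example and not Metis's data structure: the word-count workload and the names `map_words`, `group_pairs`, and `run_wordcount` are assumptions made for illustration only.

```python
from collections import defaultdict


def map_words(chunk):
    """Map phase (illustrative): emit one (word, 1) pair per word in the chunk."""
    for word in chunk.split():
        yield word.lower(), 1


def group_pairs(pairs):
    """Group intermediate pairs by key using a plain hash table.

    This is one of the simple alternatives a library could choose; how well
    it performs depends on the number of distinct keys and how often they repeat.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def run_wordcount(chunks):
    """Run map, group, and reduce (sum) over a list of text chunks."""
    pairs = (pair for chunk in chunks for pair in map_words(chunk))
    groups = group_pairs(pairs)
    return {key: sum(values) for key, values in groups.items()}
```

With many repeated keys the per-key lists grow long, and with many distinct keys the table itself grows large, which is the kind of workload dependence the abstract describes.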

CiteSeerX

DSpace@MIT

Cache craftiness for fast multicore key-value storage

Authors: Kohler Eddie; Mao Yandong; Morris Robert Tappan
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/04/2012

We present Masstree, a fast key-value database designed for SMP machines. Masstree keeps all data in memory. Its main data structure is a trie-like concatenation of B+-trees, each of which handles a fixed-length slice of a variable-length key. This structure effectively handles arbitrary-length, possibly binary keys, including keys with long shared prefixes. B+-tree fanout was chosen to minimize total DRAM delay when descending the tree and prefetching each tree node. Lookups use optimistic concurrency control, a read-copy-update-like technique, and do not write shared data structures; updates lock only affected nodes. Logging and checkpointing provide consistency and durability. Though some of these ideas appear elsewhere, Masstree is the first to combine them. We discuss design variants and their consequences. On a 16-core machine, with logging enabled and queries arriving over a network, Masstree executes more than six million simple queries per second. This performance is comparable to that of memcached, a non-persistent hash table server, and higher (often much higher) than that of VoltDB, MongoDB, and Redis.
National Science Foundation (U.S.) (Award 0834415); National Science Foundation (U.S.) (Award 0915164); Quanta Computer (Firm)
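The layering idea in the abstract (one tree per fixed-length slice of the key) can be sketched roughly as follows. This is a single-threaded toy, not Masstree itself: Python dicts stand in for the B+-trees, the 8-byte slice length is an assumption of this sketch, and `Entry`, `key_slices`, `put`, and `get` are hypothetical names. Concurrency control, prefetching, and logging are omitted entirely.

```python
SLICE_LEN = 8  # assumed slice length for this sketch


class Entry:
    """One slot in a layer: an optional value plus an optional deeper layer."""

    def __init__(self):
        self.has_value = False
        self.value = None
        self.next_layer = None  # dict standing in for the next layer's tree


def key_slices(key: bytes):
    """Split a key into fixed-length slices; only the last may be shorter."""
    if not key:
        return [b""]
    return [key[i:i + SLICE_LEN] for i in range(0, len(key), SLICE_LEN)]


def put(root, key, value):
    """Insert a key by descending one layer per slice."""
    slices = key_slices(key)
    layer = root
    for s in slices[:-1]:
        entry = layer.setdefault(s, Entry())
        if entry.next_layer is None:
            entry.next_layer = {}
        layer = entry.next_layer
    entry = layer.setdefault(slices[-1], Entry())
    entry.has_value = True
    entry.value = value


def get(root, key, default=None):
    """Look up a key; keys sharing a long prefix share the upper layers."""
    slices = key_slices(key)
    layer = root
    for s in slices[:-1]:
        entry = layer.get(s)
        if entry is None or entry.next_layer is None:
            return default
        layer = entry.next_layer
    entry = layer.get(slices[-1])
    if entry is None or not entry.has_value:
        return default
    return entry.value
```

Two keys such as `b"http://example.org/a"` and `b"http://example.org/b"` share the first two layers in this sketch and differ only in the last one, which is why long shared prefixes are compared once per layer rather than once per key.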

DSpace@MIT

Harvard University - DASH

Improved Wavelet Threshold De-noising Method Based on GNSS Deformation Monitoring Data

Authors: Gao Yandong; Mao Yachun; Sun Shuang; Xu Maolin; Yang Fengyun
Publication venue: The Institute for Research and Community Services (LPPM) ITB
Publication date: 01/01/2015

In GNSS deformation monitoring, the monitoring data are inevitably contaminated by noise. Effectively mitigating the impact of noise on the measurements, and thus improving the quality of the deformation data, is the objective of GNSS data processing. Wavelet analysis can decompose the signal by frequency. Simulation data can be used to determine the best wavelet basis function and select the appropriate decomposition level. In this paper, an improved threshold de-noising method is proposed, based on an analysis of conventional hard threshold de-noising, soft threshold de-noising and compulsory de-noising methods. The improved method was examined through a simulation analysis and applied in an engineering case. The results show that it effectively removed the noise at high frequencies while retaining data details and mutation. The de-noising ability of the proposed technique was better than that of the conventional methods. Moreover, this method significantly improved the quality of the deformation data.
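For reference, the two conventional rules the abstract compares against can be written as below. This is a sketch of the textbook hard and soft thresholding rules applied to a list of wavelet detail coefficients, not the improved method the paper proposes; the function names and the threshold value used in the example are assumptions.

```python
def hard_threshold(coeffs, t):
    """Hard rule: keep a coefficient unchanged if |w| > t, otherwise zero it."""
    return [w if abs(w) > t else 0.0 for w in coeffs]


def soft_threshold(coeffs, t):
    """Soft rule: zero small coefficients and shrink the rest toward zero by t."""
    out = []
    for w in coeffs:
        if abs(w) <= t:
            out.append(0.0)
        elif w > 0:
            out.append(w - t)
        else:
            out.append(w + t)
    return out


# Example with an assumed threshold of 1.0
details = [-3.0, -0.5, 0.2, 2.0]
hard = hard_threshold(details, 1.0)
soft = soft_threshold(details, 1.0)
```

The hard rule preserves the magnitude of retained coefficients but is discontinuous at the threshold, while the soft rule is continuous but biases every retained coefficient by `t`; improved threshold functions are typically designed as a compromise between the two.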

Neliti

Journal of Engineering and Technological Sciences

Directory of Open Access Journals

ITB Journal

Intertwined Ferroelectricity and Topological State in Two-Dimensional Multilayer

Authors: Dai Ying; Huang Baibiao; Kou Liangzhi; Liang Yan; Ma Yandong; Mao Ning
Publication date: 14/02/2021

Intertwined ferroelectricity and band topology enable non-volatile control of topological states, which is important for nanoelectronics with low energy cost and high response speed. Nonetheless, the principle for designing such systems is unclear, and a feasible approach to achieving the coexistence of the two order parameters is absent. Here, we propose a general paradigm to design 2D ferroelectric topological insulators by sliding topological multilayers, on the basis of first-principles calculations. Taking trilayer Bi2Te3 as a model system, we show that in van der Waals multilayer-based 2D topological insulators, in-plane and out-of-plane ferroelectricity can be induced through a specific interlayer sliding, enabling the coexistence of ferroelectric and topological orders. The strong coupling of the order parameters renders the topological states sensitive to polarization flip, realizing non-volatile ferroelectric control of topological properties. The revealed design guideline and ferroelectric-topological coupling are not only useful for fundamental research on coupled ferroelectric and topological physics in 2D lattices, but also enable novel applications in nanodevices.

arXiv.org e-Print Archive

Directory of Open Access Journals

Queensland University of Technology ePrints Archive

Linux Kernel Vulnerabilities: State-of-the-Art Defenses and Open Problems

Authors: Chen Haogang; Kaashoek M. Frans; Mao Yandong; Wang Xi; Zeldovich Nickolai; Zhou Dong
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2011

Avoiding kernel vulnerabilities is critical to achieving security of many systems, because the kernel is often part of the trusted computing base. This paper evaluates the current state-of-the-art with respect to kernel protection techniques, by presenting two case studies of Linux kernel vulnerabilities. First, this paper presents data on 141 Linux kernel vulnerabilities discovered from January 2010 to March 2011, and second, this paper examines how well state-of-the-art techniques address these vulnerabilities. The main findings are that techniques often protect against certain exploits of a vulnerability but leave other exploits of the same vulnerability open, and that no effective techniques exist to handle semantic vulnerabilities: violations of high-level security invariants.
United States. Defense Advanced Research Projects Agency. Clean-slate design of Resilient, Adaptive, Secure Hosts (Contract #N66001-10-2-4089)

DSpace@MIT

Crossref

Software Fault Isolation with API Integrity and Multi-Principal Modules

Authors: Chen Haogang; Kaashoek M. Frans; Mao Yandong; Wang Xi; Zeldovich Nickolai; Zhou Dong
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2011

The security of many applications relies on the kernel being secure, but history suggests that kernel vulnerabilities are routinely discovered and exploited. In particular, exploitable vulnerabilities in kernel modules are common. This paper proposes LXFI, a system which isolates kernel modules from the core kernel so that vulnerabilities in kernel modules cannot lead to a privilege escalation attack. To safely give kernel modules access to complex kernel APIs, LXFI introduces the notion of API integrity, which captures the set of contracts assumed by an interface. To partition the privileges within a shared module, LXFI introduces module principals. Programmers specify principals and API integrity rules through capabilities and annotations. Using a compiler plugin, LXFI instruments the generated code to grant, check, and transfer capabilities between modules, according to the programmer's annotations. An evaluation with Linux shows that the annotations required on kernel functions to support a new module are moderate, and that LXFI is able to prevent three known privilege-escalation vulnerabilities. Stress tests of a network driver module also show that isolating this module using LXFI does not hurt TCP throughput but reduces UDP throughput by 35%, and increases CPU utilization by 2.2-3.7x.
United States. Defense Advanced Research Projects Agency. Clean-slate design of Resilient, Adaptive, Secure Hosts (Contract number N66001-10-2-4089); National Science Foundation (U.S.) (Grant number CNS-1053143); National Basic Research Program of China (973 Program) (2007CB807901); National Natural Science Foundation (China) (61033001)

CiteSeerX

DSpace@MIT

Crossref

An Analysis of Linux Scalability to Many Cores

Authors: Boyd-Wickizer Silas; Clements Austin T.; Kaashoek M. Frans; Mao Yandong; Morris Robert Tappan; Pesterev Aleksey; Zeldovich Nickolai
Publication venue: USENIX Association
Publication date: 01/10/2010

This paper analyzes the scalability of seven system applications (Exim, memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce) running on Linux on a 48-core computer. Except for gmake, all applications trigger scalability bottlenecks inside a recent Linux kernel. Using mostly standard parallel programming techniques (this paper introduces one new technique, sloppy counters), these bottlenecks can be removed from the kernel or avoided by changing the applications slightly. Modifying the kernel required in total 3002 lines of code changes. A speculative conclusion from this analysis is that there is no scalability reason to give up on traditional operating system organizations just yet.
Quanta Computer (Firm); National Science Foundation (U.S.) (0834415); National Science Foundation (U.S.) (0915164); Microsoft Research (Fellowship); Irwin Mark Jacobs and Joan Klein Jacobs Presidential Fellowship

DSpace@MIT

Easy Freshness with Pequod Cache Joins

Authors: Kate Bryan; Kester Michael S; Kohler Eddie W; Mao Yandong; Morris Robert; Narula Neha
Publication venue: USENIX
Publication date: 21/09/2015

Pequod is a distributed application-level key-value cache that supports declaratively defined, incrementally maintained, dynamic, partially-materialized views. These views, which we call cache joins, can simplify application development by shifting the burden of view maintenance onto the cache. Cache joins define relationships among key ranges; using cache joins, Pequod calculates views on demand, incrementally updates them as required, and in many cases improves performance by reducing client communication. To build Pequod, we had to design a view abstraction for volatile, relationless key-value caches and make it work across servers in a distributed system. Pequod performs as well as other in-memory key-value caches and, like those caches, outperforms databases with view support.
Engineering and Applied Science

Harvard University - DASH

Fast in-memory storage systems: two aspects

Author: Mao Yandong
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2014

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 109-114).
This dissertation addresses two challenges relating to in-memory storage systems. The first challenge is storing and retrieving data at a rate close to the capabilities of the underlying memory system, particularly in the face of parallel accesses from multiple cores. We present Masstree, a high performance in-memory key-value store that runs on a single multi-core server. Masstree is derived from a concurrent B+tree. It provides lock-free reads for good multi-core performance, which requires special care to avoid writes interfering with concurrent reads. To reduce time spent waiting for memory for workloads with long common key prefixes, Masstree arranges a set of B+trees into a trie. Masstree uses software prefetch to further hide DRAM latency. Several optimizations improve concurrency. Masstree achieves millions of queries per second on a 16-core server, which is more than 30x as fast as MongoDB [6] or VoltDB [17]. The second challenge is replicating storage for fault-tolerance without being limited by slow writes to stable disk storage. Lazy VSR is a quorum-based replication protocol that is fast and can recover from simultaneous crashes of all the replicas as long as a majority revive with intact disks. The main idea is to acknowledge requests after recording them in memory, and to write updates to disk in the background, allowing large batched writes and thus good performance. A simultaneous crash of all replicas may leave the replicas with significantly different on-disk states; much of the design of Lazy VSR is concerned with reconciling these states efficiently during recovery. Lazy VSR's client-visible semantics are unusual in that the service may discard recent acknowledged updates if a majority of replicas crash. To demonstrate that clients can nevertheless make good use of Lazy VSR, we built a file system backend on it. Evaluation shows that Lazy VSR achieves much better performance than a version of itself with traditional group commit. Lazy VSR achieves 1.7x the performance of ZooKeeper [42] and 3.6x the performance of MongoDB [6].
by Yandong Mao. Ph. D.

DSpace@MIT

Method to Improve the Accuracy of Slope Monitoring Data Based on a Measuring Robot

Authors: Gao Yandong; Mao Yachun; Xu Maolin
Publication venue: International Association of Online Engineering (IAOE)
Publication date: 25/01/2015

To improve the accuracy of slope monitoring data based on a measuring robot, an effective and viable correction method is proposed. A 3D monitoring system based on a measuring robot and Geomos is utilized to collect data. The monitoring data of 44 cycles from the Dagushan and Yanqianshan open-pit iron mines in Anshan City are employed as the data source. Large amounts of data are calculated and compared, and a quantitative analysis of various factors that influence the accuracy of the measuring robot is performed. Data calculation shows that the proposed mathematical meteorological correction model and directional deviation correction method can effectively improve the accuracy of the measuring robot. The corrected data can accurately represent the displacement of monitoring points, which provides important real-time warning of open-pit slope landslides. Methods to improve the accuracy of the measuring robot are studied to increase data reliability. The nature of the data and the factors that affect the quality of data are analyzed.

Online-Journals.org (International Association of Online Engineering)